Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Continuing cross-check of SMIRFF parameters against parm@frosst #44

Merged
merged 8 commits into from
Jul 8, 2016

Conversation

davidlmobley
Copy link
Contributor

Continuing the work needed for #37 which was begun in #38 . Now, with the IDIVF flag, we've made real headway on c100 from AlkEthOH, but we're not quite there yet:

Compared torsion involving atoms 'C2 - C6 - C5 - H13' with that involving atoms 'C2 - C6 - C5 - H13':
Exception: Error: (proper) PeriodicTorsionForce (atoms 1, 5, 4, 22) strength (periodicity 3, phase 0.000000) has values of 0.650844 and 0.669440, respectively.

I'm out for a while, but will get back to this later or tomorrow I hope.

@davidlmobley
Copy link
Contributor Author

c100 is a challenging test with lots of torsions. I'll do some simpler tests to work back up to this:

  • c1284, methanol, passes
  • c1285, methoxymethane, passes
  • c1266, ethoxyethane, fails (H5-C2-C4-H10 torsion has strengths of 0.650844 and 0.209200)
  • c1008, butane, fails (H7-C3-C4-H10 torsion has strengths of 0.627600 and 0.209200)
  • c901, propane, fails (H4-C2-C3-H8 torsion has strengths of 0.6276 and 0.209200)
  • c38, ethane, fails (H3-C1-C2-H6 torsion has strengths of 0.6276 and 0.209200)

OK, so that leads me to the simple ethane/butane/propane case as a starting point for troubleshooting, as it's looking like there's a consistent difference with one very identifiable torsion. It's different by a factor of 3! I'll work on that.

@davidlmobley
Copy link
Contributor Author

davidlmobley commented Jul 7, 2016

Oh, and this is THIS torsion:

PeriodicTorsionGenerator:

                               [#1:1]-[#6X4:2]-[#6X4:3]-[#1:4] :        9 matches

@davidlmobley
Copy link
Contributor Author

Fixed that one - I had a typo in idivf for that torsion.

@davidlmobley
Copy link
Contributor Author

So now that fixes the few cases I just posted except c1266, ethoxyethane, where I have values of 0.650844 and 0.627600 for the H5-C2-C4-H10 torsion. Here are the torsions which actually match anything:

PeriodicTorsionGenerator:

                               [a,A:1]-[#6X4:2]-[#6X4:3]-[a,A:4] :       18 matches
                               [a,A:1]-[#6X4:2]-[#8X2:3]-[!#1:4] :        6 matches
                                 [#1:1]-[#6X4:2]-[#6X4:3]-[#1:4] :       12 matches
                             [#6X4:1]-[#6X4:2]-[#8X2:3]-[#6X4:4] :        2 matches
                               [#8X2:1]-[#6X4:2]-[#6X4:3]-[#1:4] :        6 matches

I'm imagining there is a typo in one of these. Presumably it's the generic at the top or the [#1:1]-[#6X4:2]-[#6X4:3]-[#1:4]. Oddly, the 0.6276 value is the same as we just fixed for the [#1:1]-[#6X4:2]-[#6X4:3]-[#1:4] in propane, so that's probably what's coming up in OpenMM. Weird that there is a different value for AMBER. Maybe THIS is the "electronegative atom" (H2) bug in parm@frosst? Need to check atom types for this molecule, but I have to dash off.

@davidlmobley
Copy link
Contributor Author

davidlmobley commented Jul 7, 2016

@cbayly13 - I could use your input on this one. Right now we seem to not properly handle all the torsions involving H1, H2, and H3 in SMIRFF.

This particular instance just noted above is ethoxyethane, chains compound c1266 from AlkEthOH, and the issue seems to be that EITHER you've done something smart OR made a mistake. The issue in question is the HC-CT-CT-H1 in this molecule. Here's the problem, I believe:

So, the key point is, your SMIRFF mockup applies a specific torsion to H*-CT-CT-H* which is not applied by parm@frosst.

Did you do this on purpose and deliberately deviate from parm@frosst, or do you need to (a) amend [#1:1]-[#6X4:2]-[#6X4:3]-[#1:4] for HC-CT-CT-HC so that it doesn't match HC-CT-CT-H1 (and -H2 and -H3), or (b) override this match by adding specific HC-CT-CT-H1 (and -H2 and -H3) torsions later in the file which override the [#1:1]-[#6X4:2]-[#6X4:3]-[#1:4] case by re-applying the generic torsion?

I think (b) is probably easier and the problem could be solved if we added (OK, maybe I'll botch the SMARTS, but I'll try) an electron withdrawing group attached to the carbon to re-apply the generic, so maybe something like
[#1:1]-[#6X4:2]-[#6X4:3](-[#8])-[#1:4] further down in the parameter file, then provide the 1.4 kcal/mol barrier height with idivf of 9 again, to override the H*-CT-CT-H* case you're currently applying. Does this sound right to you?

@cbayly13
Copy link
Collaborator

cbayly13 commented Jul 7, 2016

@davidlmobley , your option (b) is correct and your smarts string is also correct (I checked it). We are building in a parm99 bug here... let's make sure we tag it for removal later.

Note that this bug in parm99+parm@Frosst all because of atom typing. That said, I would hope the energy difference is not too large because "barrier/idivf" is pretty similar: for the generic 1.40/9=0.1555, and for the specific HC-CT-CT-HC, 0.15/1=0.1500 so it shouldn't be so horrible.

@davidlmobley
Copy link
Contributor Author

@cbayly13 :

your option (b) is correct and your smarts string is also correct (I checked it).

Excellent, I will "fix" that.

We are building in a parm99 bug here... let's make sure we tag it for removal later.

Just to clarify - what exactly is the bug in parm99 here? Should it be that the HC-CT-CT-HC torsion also applies to all other non-typed hyrogens, i.e. as per your SMARTS [#1:1]-[#6X4:2]-[#6X4:3]-[#1:4]?

That said, I would hope the energy difference is not too large because "barrier/idivf" is pretty similar

Yes, this is a small difference so it doesn't seem important - but when trying to make sure the energies are the same, it certainly poses a problem. :)

@davidlmobley
Copy link
Contributor Author

For the record, the bug is that, well, there's no reason that parm99/parm@frosst should be applying the generic X-CT-CT-X torsion to all hydrogens EXCEPT HC-CT-CT-HC; it's just that someone forgot to give other hydrogens the same treatment. It's a small parameter difference, however.

@davidlmobley
Copy link
Contributor Author

OK, I just pushed that "fix" (reproducing the minor parm99/parm@frosst bug in the SMIRFF xml, with a suitable comment noting it) and now I can reproduce parm@frosst energies with SMIRFF for this case, ethoxyethane. This means I now pass for:

  • c1284, methanol
  • c1285, methoxymethane
  • c1266, ethoxyethane
  • c1008, butane
  • c901, propane
  • c38, ethane

Returning to c100, I still fail there, with the C2-C6-C5-H13 torsion, which is CT-CT-CT-H1, naturally. Here, barrier heights are 0.650844 and 0.669440 kJ/mol, respectively, for AMBER and OpenMM.

Probably this is sort of a variant of the last issue, in that I bet a generic is being used for the H1 which shouldn't be in parm@frosst, and the suitable hydrogen-containing version is used here because our SMARTS are more general. Checking.

I'm also adding a second .ffxml file here which will provide "actual parm@frosst" if I can, for energy checking, versus "correct parm@frosst" which is the one without these bugs (i.e. our original once corrected).

@davidlmobley
Copy link
Contributor Author

Yes, indeed, this seems to be the same type of issue. Here we have [#1:1]-[#6X4:2]-[#6X4:3]-[#6X4:4] which covers H_-CT-CT-CT where H_ is any type of H, but parm99/parm@frosst use HC-CT-CT-CT so the H_-CT-CT-CT torsion (where H_ is other H types, such as H1, H2, H3) picks up the value for the generic X -CT-CT-X.

I'll introduce this bug as well to the special xml file for energy comparisons. Sheesh. While THESE do not result in a big difference in parameters, I wonder how many crazy things like this there are for more complicated molecules where a completely unintended generic parameter gets applied because someone created derivative types without going back and ensuring they were using the right torsions. It's horrifying.

I just introduced this via the SMARTS [#1:1]-[#6X4:2]-[#6X4:3](-[#8])-[#6X4:4] and now I get past that torsion, but now I get stuck on the C4-C7-C3-H7 torsion, which is CT-CT-CT-HC. Now I'm getting 0.669440 and 0.650844 for AMBER and OpenMM respectively, which is the reverse of the above. I think I've overshot now and somehow assigned a generic value back to the CT-CT-CT-HC torsion here, which I didn't want to do. I need to check this, but I have to dash off so I'll revisit this soon.

@davidlmobley
Copy link
Contributor Author

davidlmobley commented Jul 8, 2016

I'd botched the SMARTS and wasn't matching the right torsions exclusively. I've now fixed it, and those torsions pass for c100.

Next, I hit an issue on O1-C5-C6-O3 for the same molecule, which is OH-CT-CT-OS, where force constants were way off. We had a typo in the XML file where 1.175 had become 0.175. This is now fixed.

Now c100 passes.

Once the tests pass, I should merge this PR and then start a new PR to:

  • test on multiple molecules
  • update the automated testing framework to do energy comparisons on some molecules

etc.

@davidlmobley davidlmobley merged commit f8c71f0 into master Jul 8, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants